Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems
Abstract
In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the resources consumed in decision making (e.g. execution time, memory) will influence, to a major degree, the game performance. When classical search algorithms such as A∗ can be used, they are the very first option. Nevertheless, such methods rely on precise and complete models of the search space, and there are many interesting scenarios where their application is not possible. In those cases, model-free methods for sequential decision making under uncertainty are the best choice. In this paper, we propose a heuristic planning strategy that incorporates the ability of heuristic search in path-finding into a Dyna agent. The proposed Dyna-H algorithm, as A∗ does, selects the branches most likely to produce outcomes over other branches. Moreover, it has the advantage of being a model-free online reinforcement learning algorithm. The proposal was evaluated against the one-step Q-Learning and Dyna-Q algorithms with excellent experimental results: Dyna-H significantly outperforms both methods in all experiments. We also suggest a functional analogy between the proposed sampling-from-worst-trajectories heuristic and the role of dreams (e.g. nightmares) in human behavior.
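To make the idea concrete, below is a minimal sketch of how a heuristic-guided Dyna planning loop of this kind could look. It assumes a tabular grid-world setting, a distance-to-goal heuristic (e.g. Euclidean distance), and a hypothetical environment interface (`reset()`, `step()`, `actions`); it is an illustration of the general technique described in the abstract, not the authors' reference implementation.

```python
import random
from collections import defaultdict

def dyna_h(env, heuristic, episodes=200, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Dyna agent with heuristic-guided planning (illustrative sketch).

    Assumes `env` exposes reset(), step(a) -> (next_state, reward, done) and a
    discrete `actions` list, and that `heuristic(s)` estimates distance-to-goal.
    These interfaces are assumptions made for this sketch.
    """
    Q = defaultdict(float)          # Q[(state, action)]
    model = defaultdict(dict)       # model[state][action] = (reward, next_state)

    def greedy(s):
        return max(env.actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy acting in the real environment (model-free step)
            a = random.choice(env.actions) if random.random() < epsilon else greedy(s)
            s2, r, done = env.step(a)

            # direct one-step Q-learning update from real experience
            target = r + gamma * max(Q[(s2, b)] for b in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])

            # model learning: remember the observed transition (deterministic model)
            model[s][a] = (r, s2)

            # heuristic planning: from a randomly drawn visited state, replay the
            # stored action whose successor looks worst under the heuristic
            # ("sampling from worst trajectories")
            for _ in range(planning_steps):
                ps = random.choice(list(model))
                pa = max(model[ps], key=lambda act: heuristic(model[ps][act][1]))
                pr, ps2 = model[ps][pa]
                ptarget = pr + gamma * max(Q[(ps2, b)] for b in env.actions)
                Q[(ps, pa)] += alpha * (ptarget - Q[(ps, pa)])

            s = s2
    return Q
```

The only difference from a standard Dyna agent is the planning step: instead of replaying remembered transitions uniformly at random, it biases replay toward successors the heuristic ranks as worst, which is one way to read the abstract's "sampling from worst trajectories" idea.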
Similar resources
Wp-dyna: Planning and Reinforcement Learning in Well-plannable Environments
Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often-used solution is the integration of planning, for example, Sutton's Dyna algorithm, or various oth...
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper, I present and show results for two Dyna archi...
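For contrast with the heuristic-guided replay sketched earlier, a classical Dyna-Q planning step samples previously experienced state-action pairs uniformly at random. A minimal sketch of that inner loop, reusing the hypothetical `Q`, `model` and `env` structures assumed above, might be:

```python
import random

def dyna_q_planning(Q, model, env, planning_steps, alpha, gamma):
    """Classical Dyna-Q planning loop: uniform replay of remembered transitions."""
    for _ in range(planning_steps):
        ps = random.choice(list(model))           # any previously visited state
        pa = random.choice(list(model[ps]))       # any action already tried from it
        pr, ps2 = model[ps][pa]
        target = pr + gamma * max(Q[(ps2, b)] for b in env.actions)
        Q[(ps, pa)] += alpha * (target - Q[(ps, pa)])
```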
Reinforcement Learning–Based Energy Management Strategy for a Hybrid Electric Tracked Vehicle
This paper presents a reinforcement learning (RL)–based energy management strategy for a hybrid electric tracked vehicle. A control-oriented model of the powertrain and vehicle dynamics is first established. According to the sample information of the experimental driving schedule, statistical characteristics at various velocities are determined by extracting the transition probability matrix of...
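The transition probability matrix mentioned here can be estimated directly from a recorded driving schedule by counting observed transitions between discretized states and normalizing each row. A small sketch of that estimation step, with the state binning chosen purely for illustration, is shown below.

```python
import numpy as np

def transition_matrix(samples, n_states):
    """Estimate a Markov transition probability matrix from a sampled sequence.

    `samples` is a 1-D sequence of discretized states (e.g. velocity or power-demand
    bins extracted from a recorded driving schedule); the binning is an assumption
    made for this sketch.
    """
    counts = np.zeros((n_states, n_states))
    for s, s_next in zip(samples[:-1], samples[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # leave rows of unvisited states as all zeros instead of dividing by zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# example: a short trace of velocities discretized into 5 bins
demo = transition_matrix([0, 1, 1, 2, 3, 2, 1, 0, 4, 4, 3], n_states=5)
```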
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper, I present and show results for two Dyna architect...
Competitive Reinforcement Learning for Combinatorial Problems
This paper shows that the competitive learning rule found in Learning Vector Quantization (LVQ) serves as a promising function approximator to enable reinforcement learning methods to cope with a large decision search space, defined in terms of different classes of input patterns, like those found in the game of Go. In particular, this paper describes S[arsa]LVQ, a novel reinforcement learning ...
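One way such a combination of an LVQ-style competitive learner with a Sarsa-style update could look is sketched below: each prototype stores a codebook vector in state-feature space plus one Q-value per action, the nearest prototype answers Q(s, a), and only the winner is updated. The class name, prototype count, and learning rates are assumptions for illustration, not the paper's actual S[arsa]LVQ design.

```python
import numpy as np

class SarsaLVQSketch:
    """Illustrative sketch: LVQ-style prototypes as a Q-function approximator."""

    def __init__(self, n_prototypes, n_features, n_actions,
                 alpha=0.1, beta=0.01, gamma=0.95, seed=0):
        rng = np.random.default_rng(seed)
        self.codebook = rng.normal(size=(n_prototypes, n_features))  # prototype vectors
        self.q = np.zeros((n_prototypes, n_actions))                 # Q-values per prototype
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def _winner(self, x):
        # competitive step: the nearest prototype wins
        return int(np.argmin(np.linalg.norm(self.codebook - x, axis=1)))

    def value(self, x, a):
        return self.q[self._winner(x), a]

    def update(self, x, a, r, x_next, a_next, done):
        """Sarsa backup applied to the winning prototype's Q entry."""
        w = self._winner(x)
        target = r if done else r + self.gamma * self.value(x_next, a_next)
        self.q[w, a] += self.alpha * (target - self.q[w, a])
        # competitive learning: pull the winning codebook vector toward the input
        self.codebook[w] += self.beta * (x - self.codebook[w])
```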
Journal: Knowl.-Based Syst.
Volume: 32
Pages: -
Publication year: 2012